Semi-greedy heuristics for feature selection with test cost constraints
Abstract
In real-world applications, the test cost of data collection should not exceed a given budget. The problem of selecting an informative feature subset under this budget is referred to as feature selection with test cost constraints. Greedy heuristics are a natural and efficient method for this kind of combinatorial optimization problem. However, the recursive selection of locally optimal choices means that the global optimum is often missed. In this paper, we present a three-step semi-greedy heuristic method that directly forms a population of candidate solutions to obtain better results. In the first step, we design the heuristic function. The second step involves the random selection of a feature from the current best k features at each iteration. This is the major difference from conventional greedy heuristics. In the third step, we obtain p candidate solutions and select the best one. Through a series of experiments on four datasets, we compare our algorithm with a classic greedy heuristic approach and an information gain-based k-weighted greedy heuristic method. The results show that the new approach is more likely to obtain optimal solutions.
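The three steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the heuristic function, so the sketch assumes marginal gain per unit test cost, and the names `semi_greedy_select`, `costs`, `budget`, and `score` are hypothetical. Test costs are assumed to be strictly positive.

```python
import random
from typing import Callable, Dict, FrozenSet, Optional


def semi_greedy_select(
    costs: Dict[str, float],
    budget: float,
    score: Callable[[FrozenSet[str]], float],
    k: int = 2,
    p: int = 10,
    seed: Optional[int] = None,
) -> FrozenSet[str]:
    """Build p candidate subsets semi-greedily and return the best one.

    `score` is a user-supplied subset quality measure (an assumption of
    this sketch; the paper's own heuristic function is not given here).
    """
    rng = random.Random(seed)
    best_subset: FrozenSet[str] = frozenset()
    best_value = score(best_subset)

    for _ in range(p):  # step 3: form p candidate solutions
        selected = set()
        spent = 0.0
        while True:
            base = score(frozenset(selected))
            # Step 1 (assumed heuristic): marginal gain per unit test cost,
            # computed only for features that still fit within the budget.
            ratios = {}
            for feature, cost in costs.items():
                if feature in selected or spent + cost > budget:
                    continue
                gain = score(frozenset(selected | {feature})) - base
                if gain > 0:
                    ratios[feature] = gain / cost
            if not ratios:
                break
            # Step 2: pick at random among the current best k features.
            top_k = sorted(ratios, key=ratios.get, reverse=True)[:k]
            choice = rng.choice(top_k)
            selected.add(choice)
            spent += costs[choice]

        value = score(frozenset(selected))
        if value > best_value:  # keep the best candidate seen so far
            best_subset, best_value = frozenset(selected), value

    return best_subset
```

In this sketch, setting k = 1 removes the randomness and reduces the procedure to a conventional greedy heuristic. With k > 1, different runs can follow different selection paths, which is what gives the population of p candidates a chance to include a subset that the purely greedy path misses.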
Similar resources
Effective heuristics and meta-heuristics for the quadratic assignment problem with tuned parameters and analytical comparisons
The quadratic assignment problem (QAP) is a well-known problem in facility location and layout. It belongs to the NP-complete class. Many heuristic and meta-heuristic methods have been presented for QAP in the literature. In this paper, we applied 2-opt, greedy 2-opt, 3-opt, greedy 3-opt, and VNZ as heuristic methods and tabu search (TS), simulated annealing, and pa...
Improved algorithms for multiplex PCR primer set selection with amplification length constraints
Numerous high-throughput genomics assays require the amplification of a large number of genomic loci of interest. Amplification is cost-effectively achieved using several short single-stranded DNA sequences called primers and polymerase enzyme in a reaction called multiplex polymerase chain reaction (MP-PCR). Amplification of each locus requires that two of the primers bind to the forward and r...
How To Make a Greedy Heuristic for the Asymmetric Traveling Salesman Problem Competitive
Many computational experiments confirm that greedy-type heuristics for the Traveling Salesman Problem (TSP) produce rather poor solutions, except for the Euclidean TSP. A greedy heuristic usually selects the arcs to include on the basis of cost values. We propose to use upper tolerances of an optimal solution to one of the relaxed Asymmetric TSP (ATSP) to ...
Wavelength Assignment in Optical Network Design
We consider a flexible greedy approach to wavelength assignment in an optical network with the goal of minimizing the cost incurred by wavelength conversions and fiber deployment. The greedy approach processes demands one by one in a certain order and makes a locally optimal choice for each demand. We address several heuristics for creating desirable demand orderings, including a random orderin...
Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection
Due to the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to enhance the accuracy of credit card fraud detection. After selecting the best and most effec...